Chapter 14. Server-Side Includes

Server-side includes trigger further actions whose output, if any, may then be placed inline into served documents or affect subsequent includes. The same results could be achieved by CGI scripts — either shell scripts or specially written C programs — but server-side includes often achieve these results with a lot less effort. There are, however, some security problems. The range of possible actions is immense, so we will just give basic illustrations of each command in a number of text files in ...site.ssi/htdocs.

The Config file, .../conf/httpd1.conf, is as follows:

User webuser
Group webgroup
ServerName www.butterthlies.com
DocumentRoot /usr/www/APACHE3/site.ssi/htdocs
ScriptAlias /cgi-bin /usr/www/APACHE3/cgi-bin
AddHandler server-parsed shtml
Options +Includes

Run it by executing ./go 1.

shtml is the normal extension for HTML documents with server-side includes in them and is the extension of the relevant files in ... /htdocs. We could just as well use brian or dog_run, as long as the extension on the files matches the one named in the AddHandler line of the configuration file. Using html can be useful (for instance, you can easily implement site-wide headers and footers), but it does mean that every HTML page gets parsed by the SSI engine. On busy systems, this could reduce performance.

Bear in mind that HTML generated by a CGI script does not get put through the SSI processor, so it's no good including the markup listed in this chapter in a CGI script.

Options Includes turns on processing of SSIs. As usual, look in the error_log if things don't work. The error messages passed to the client are necessarily uninformative since they are probably being read three continents away, where nothing useful can be done about them.

The trick of SSI is to insert special strings into our documents, which then get picked up by Apache on their way through, tested against reference strings using =, !=, <, <=, >, and >=, and then replaced by dynamically written messages. As we will see, the strings have a deliberately unusual form so they won't get confused with more routine stuff. This is the syntax of a command:

<!--#element attribute="value" attribute="value" ... -->
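To make the shape of this syntax concrete, here is a minimal Python sketch that picks elements and their attributes out of a page. It is an illustration only, not how mod_include itself parses documents; the names DIRECTIVE, ATTRIBUTE, and parse_directives are hypothetical, and the sketch assumes attribute values are always double-quoted.

```python
import re

# One SSI element: <!--#element attribute="value" attribute="value" ... -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+="[^"]*")*)\s*-->')
# One attribute="value" pair inside an element.
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

def parse_directives(text):
    """Return a list of (element, [(attribute, value), ...]) tuples."""
    found = []
    for match in DIRECTIVE.finditer(text):
        element = match.group(1)
        # Keep attributes as an ordered list: an element may repeat one.
        attributes = ATTRIBUTE.findall(match.group(2))
        found.append((element, attributes))
    return found

page = ('<!--#config errmsg="Bungled again!" sizefmt="bytes" -->\n'
        'The size of this file is <!--#fsize file="size.shtml"--> bytes.')
for element, attributes in parse_directives(page):
    print(element, attributes)
```

Attributes are kept as an ordered list rather than a dictionary because some elements accept the same attribute more than once.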

The Apache manual tells us what the elements are:

config

This command controls various aspects of the parsing. The valid attributes are as follows:

errmsg

The value is a message that is sent back to the client if an error occurs during document parsing.

sizefmt

The value sets the format to be used when displaying the size of a file. Valid values are bytes for a count in bytes or abbrev for a count in kilobytes or megabytes, as appropriate.

timefmt

The value is a string to be used by the strftime( ) library routine when printing dates.

echo

This command prints one of the include variables, defined later in this chapter. If the variable is unset, it is printed as (none). Any dates printed are subject to the currently configured timefmt. This is the only attribute:

var

The value is the name of the variable to print.

exec

The exec command executes a given shell command or CGI script. Options IncludesNOEXEC disables this command completely — a boon to the prudent webmaster. The valid attribute is as follows:

cgi

The value specifies a %-encoded URL relative path to the CGI script. If the path does not begin with a slash, it is taken to be relative to the current document. The document referenced by this path is invoked as a CGI script, even if the server would not normally recognize it as such. However, the directory containing the script must be enabled for CGI scripts (with ScriptAlias or the ExecCGI option). The protective wrapper suEXEC will be applied if it is turned on. The CGI script is given the PATH_INFO and query string (QUERY_STRING) of the original request from the client; these cannot be specified in the URL path. The include variables will be available to the script in addition to the standard CGI environment. If the script returns a Location header instead of output, this is translated into an HTML anchor. If Options IncludesNOEXEC is set in the Config file, this command is turned off. The include virtual element should be used in preference to exec cgi.

cmd

The server executes the given string using /bin/sh. The include variables are available to the command. If Options IncludesNOEXEC is set in the Config file, this is disabled and will cause an error, which will be written to the error log.

fsize

This command prints the size of the specified file, subject to the sizefmt format specification. The attributes are as follows:

file

The value is a path relative to the directory containing the current document being parsed.

virtual

The value is a %-encoded URL path relative to the document root. If it does not begin with a slash, it is taken to be relative to the current document.

flastmod

This command prints the last modification date of the specified file, subject to the timefmt format specification. The attributes are the same as for the fsize command.

include

This command includes other files immediately at that point in parsing — right there and then, not later on. Any included file is subject to the usual access control. If the directory containing the parsed file has Options IncludesNOEXEC set and including the document causes a program to be executed, it isn't included: this prevents the execution of CGI scripts. Otherwise, CGI scripts are invoked as normal using the complete URL given in the command, including any query string.

An attribute defines the location of the document; the inclusion is done for each attribute given to the include command. The valid attributes are as follows:

file

The value is a path relative to the directory containing the current document being parsed. It can't contain ../, nor can it be an absolute path. The virtual attribute should always be used in preference to this one.

virtual

The value is a %-encoded URL relative to the document root. The URL cannot contain a scheme or hostname, only a path and an optional query string. If it does not begin with a slash, then it is taken to be relative to the current document. A URL is constructed from the attribute's value, and the server returns the same output it would have if the client had requested that URL. Thus, included files can be nested. A CGI script can still be run by this method even if Options IncludesNOEXEC is set in the Config file. The reasoning is that clients can run the CGI anyway by using its URL as a hot link or simply by typing it into their browser; so no harm is done by using this method (unlike cmd or exec).

14.1 File Size

The fsize command allows you to report the size of a file inside a document. The file size.shtml is as follows:

<!--#config errmsg="Bungled again!"-->
<!--#config sizefmt="bytes"-->
The size of this file is <!--#fsize file="size.shtml"--> bytes.
The size of another_file is <!--#fsize file="another_file"--> bytes.

The first line provides an error message. The second line means that the size of any files is reported in bytes printed as a number, for instance, 89. Changing bytes to abbrev gets the size in kilobytes, printed as 1k. The third line prints the size of size.shtml itself; the fourth line prints the size of another_file. config commands must appear above commands that might want to use them.

You can replace the word file= in this script, and in those which follow, with virtual=, which gives a %-encoded URL path relative to the document root. If it does not begin with a slash, it is taken to be relative to the current document.

If you play with this stuff, you find that Apache is strict about the syntax. For instance, trailing spaces cause an error because valid filenames don't have them:

The size of this file is <!--#fsize file="size.shtml   "--> bytes.
The size of this file is Bungled again! bytes.

If we had not used the errmsg command, we would see the following:

...[an error occurred while processing this directive]...

14.2 File Modification Time

The last modification time of a file can be reported with flastmod. This lets the client know how fresh the data is that you are offering. The format of the output is controlled by the timefmt attribute of the config element. The default rules for timefmt are the same as for the C-library function strftime( ), except that the year is now shown in four-digit format to cope with the Year 2000 problem. Win32 Apache is soon to be modified to make it work in the same way as the Unix version. Win32 users who do not have access to Unix C manuals can consult the FreeBSD documentation at http://www.freebsd.org, for example:

% man strftime

(We have not included it here because it may well vary from system to system.)

The file time.shtml gives an example:

<!--#config errmsg="Bungled again!"-->
<!--#config timefmt="%A %B %d, the %jth day of the year, %s seconds since the Epoch"-->
The mod time of this file is <!--#flastmod virtual="size.shtml"-->
The mod time of another_file is <!--#flastmod virtual="another_file"-->

This produces a response such as the following:

The mod time of this file is Tuesday August 27, the 240th day of the year, 841162166 
seconds since the Epoch The mod time of another_file is Tuesday August 27, the 240th 
day of the year, 841162166 seconds since the Epoch
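The conversion specifiers are easy to confuse: %d is the day of the month while %C is the century, and %S is the seconds field of the time (00 to 60) while %s, a common extension, is the number of seconds since the Epoch. A format string can be tried outside Apache before it goes into a config element. The following Python sketch assumes Python's strftime follows the same C-library conventions; it uses the Epoch value from the sample response, interpreted as UTC, and the helper name describe is hypothetical.

```python
from datetime import datetime, timezone

# The Epoch value from the sample response, interpreted as UTC.
stamp = datetime.fromtimestamp(841162166, tz=timezone.utc)

def describe(moment):
    """Format a datetime with weekday, month, day of month, and day of year."""
    return moment.strftime("%A %B %d, the %jth day of the year")

print(describe(stamp))
# prints: Tuesday August 27, the 240th day of the year
print(int(stamp.timestamp()))
# prints: 841162166
```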

14.3 Includes

We can include one file in another with the include command:

<!--#config errmsg="Bungled again!"-->
This is some text in which we want to include text from another file:
&lt;&lt; <!--#include virtual="another_file"--> &gt;&gt;
That was it.

This produces the following response:

This is some text in which we want to include text from another file:
<< This is the stuff in 'another_file'. >>
That was it.

14.4 Execute CGI

We can have a CGI script executed without having to bother with AddHandler, SetHandler, or ExecCGI. The file exec.shtml contains the following:

<!--#config errmsg="Bungled again!"-->
We're now going to execute 'cmd="ls -l"'':
<< <!--#exec cmd="ls -l"--> >>
and now /usr/www/APACHE3/cgi-bin/mycgi.cgi:
<< <!--#exec cgi="/cgi-bin/mycgi.cgi"--> >>
and now the 'virtual' option:
<< <!--#include virtual="/cgi-bin/mycgi.cgi"--> >>
That was it.

There are two attributes available to exec: cgi and cmd. The difference is that cgi needs a URL (in this case /cgi-bin/mycgi.cgi, set up by the ScriptAlias line in the Config file) and is protected by suEXEC if configured, whereas cmd will execute anything.

There is a third way of executing a file, namely, through the virtual attribute to the include command. When we select exec.shtml from the browser, we get this result:

We're now going to execute 'cmd="ls -l"'':
<< total 24
-rw-rw-r--  1 414  xten   39 Oct  8 08:33 another_file
-rw-rw-r--  1 414  xten  106 Nov 11  1997 echo.shtml
-rw-rw-r--  1 414  xten  295 Oct  8 10:52 exec.shtml
-rw-rw-r--  1 414  xten  174 Nov 11  1997 include.shtml
-rw-rw-r--  1 414  xten  206 Nov 11  1997 size.shtml
-rw-rw-r--  1 414  xten  269 Nov 11  1997 time.shtml
 >>
and now /usr/www/APACHE3/cgi-bin/mycgi.cgi:
<< Have a nice day
 >>
and now the 'virtual' option:
<< Have a nice day
 >>
That was it.

A prudent webmaster should view the cmd and cgi options with grave suspicion, since they let writers of SSIs give both themselves and outsiders dangerous access. However, if he uses Options +IncludesNOEXEC in conf/httpd2.conf, stops Apache, and restarts with ./go 2, the problem goes away:

We're now going to execute 'cmd="ls -l"'':
<< Bungled again! >>
and now /usr/www/APACHE3/cgi-bin/mycgi.cgi:
<< Bungled again! >>
and now the 'virtual' option:
<< Have a nice day
 >>
That was it.

Now, nothing can be executed through an SSI that couldn't be executed directly through a browser, with all the control that this implies for the webmaster. (You might think that exec cgi= would be the way to do this, but it seems that some question of backward compatibility intervenes.)

Apache 1.3 introduced the following improvement: buffers containing the output of CGI scripts are flushed and sent to the client whenever the buffer has something in it and the server is waiting.

14.5 Echo

Finally, we can echo a limited number of environment variables: DATE_GMT, DATE_LOCAL, DOCUMENT_NAME, DOCUMENT_URI, and LAST_MODIFIED. The file echo.shtml is as follows:

Echoing the Document_URI <!--#echo var="DOCUMENT_URI"-->
Echoing the DATE_GMT <!--#echo var="DATE_GMT"-->

and produces the response:

Echoing the Document_URI /echo.shtml
Echoing the DATE_GMT Saturday, 17-Aug-96 07:50:31 
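The substitution that echo performs can be sketched in a few lines of Python. This is an illustration of the behavior described in this chapter, not mod_include's code; expand_echo and the variable values passed to it are hypothetical, and the placeholder for an unset variable is the (none) string mentioned earlier.

```python
import re

# Matches <!--#echo var="NAME"--> with optional space before the end tag.
ECHO = re.compile(r'<!--#echo\s+var="([^"]*)"\s*-->')

def expand_echo(text, variables, undefined="(none)"):
    """Replace each echo element with the variable's value or a placeholder."""
    return ECHO.sub(lambda m: variables.get(m.group(1), undefined), text)

page = 'Echoing the Document_URI <!--#echo var="DOCUMENT_URI"-->'
print(expand_echo(page, {"DOCUMENT_URI": "/echo.shtml"}))
# prints: Echoing the Document_URI /echo.shtml
print(expand_echo('<!--#echo var="NO_SUCH_VAR"-->', {}))
# prints: (none)
```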

14.6 Apache v2: SSI Filters

Apache v2, with its filter mechanism, introduced some new SSI directives:

SSIEndTag  

SSIEndTag tag 
Default: SSIEndTag "-->"
Context: Server config, virtual host 
 

This directive changes the string that mod_include looks for to mark the end of an include element.

Example

SSIEndTag "%>"  

See also SSIStartTag.

SSIErrorMsg  

SSIErrorMsg message 
Default: SSIErrorMsg "[an error occurred while processing this directive]" 
Context: Server config, virtual host, directory, .htaccess 
 

The SSIErrorMsg directive changes the error message displayed when mod_include encounters an error. For production servers you may consider changing the default error message to "<!-- Error -->" so that the message is not presented to the user. This directive has the same effect as the <!--#config errmsg="message" --> element.

Example

SSIErrorMsg "<!-- Error -->"

SSIStartTag

SSIStartTag tag
Default: SSIStartTag "<!--"
Context: Server config, virtual host 
 

This directive changes the string that mod_include looks for to mark an include element to process. You may want to use this option if you have two servers parsing the output of a file, each processing different commands (possibly at different times).

Example

SSIStartTag "<%"  

Together with a matching SSIEndTag, this allows SSI directives to be written with alternate start and end tags, such as the following:

<%#printenv %>  

See also SSIEndTag.
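The effect of changing the delimiters can be sketched in Python. The function below is a hypothetical illustration, not mod_include's matching code; it assumes, as in the example above, that the # and the element name follow the start tag directly.

```python
import re

def find_elements(text, start_tag="<!--", end_tag="-->"):
    """Return (element, attribute_text) pairs delimited by the given tags."""
    pattern = re.compile(
        re.escape(start_tag) + r"#(\w+)(.*?)" + re.escape(end_tag),
        re.DOTALL,
    )
    return [(m.group(1), m.group(2).strip()) for m in pattern.finditer(text)]

page = '<%#printenv %> and <!--#echo var="DATE_GMT"-->'
print(find_elements(page, "<%", "%>"))
# prints: [('printenv', '')]
print(find_elements(page))
# prints: [('echo', 'var="DATE_GMT"')]
```

Only elements written with the configured tags are recognized, which is what lets two servers each process their own commands in the same file.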

SSITimeFormat  

SSITimeFormat formatstring 
Default: SSITimeFormat "%A, %d-%b-%Y %H:%M:%S %Z" 
Context: Server config, virtual host, directory, .htaccess 
 

This directive changes the format in which date strings are displayed when echoing DATE environment variables. The formatstring is as in strftime(3) from the C standard library.

This directive has the same effect as the <!--#config timefmt="formatstring" --> element.

Example

SSITimeFormat "%R, %B %d, %Y"  

The previous directive would cause times to be displayed in the format "22:26, June 14, 2002".

SSIUndefinedEcho  

SSIUndefinedEcho tag 
Default: SSIUndefinedEcho "<!-- undef -->"
Context: Server config, virtual host 
 

This directive changes the string that mod_include displays when a variable is not set and "echoed."

Example

SSIUndefinedEcho "[ No Value ]"

XBitHack

XBitHack on|off|full 
Default: XBitHack off 
Context: Server config, virtual host, directory, .htaccess 
 

The XBitHack directive controls the parsing of ordinary HTML documents. This directive only affects files associated with the MIME type text/html. XBitHack can take on the following values:

off

This offers no special treatment of executable files.

on

Any text/html file that has the user-execute bit set will be treated as a server-parsed HTML document.

full

As for on, but the group-execute bit is tested as well. If it is set, the Last-Modified header of the returned file is set to the file's last modification time. If it is not set, no Last-Modified header is sent. Setting this bit allows clients and proxies to cache the result of the request.

Do not use the full option unless you ensure that the group-execute bit is unset for every SSI script that might include a CGI script or otherwise produce different output on each hit (or could potentially change on subsequent requests).
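The three settings amount to a small decision rule on a file's permission bits. The Python sketch below restates the rule as described here, as far as XBitHack alone is concerned; it is not taken from mod_include, and the function name xbithack_treatment and its return strings are hypothetical.

```python
import stat

def xbithack_treatment(mode, setting):
    """Describe how a text/html file with these permission bits is served.

    mode is a permission value such as os.stat(path).st_mode;
    setting is "off", "on", or "full".
    """
    if setting not in ("off", "on", "full"):
        raise ValueError("setting must be off, on, or full")
    # Without the user-execute bit (or with XBitHack off) nothing changes.
    if setting == "off" or not mode & stat.S_IXUSR:
        return "not parsed"
    # Under full, the group-execute bit additionally enables Last-Modified.
    if setting == "full" and mode & stat.S_IXGRP:
        return "parsed, Last-Modified sent"
    return "parsed, no Last-Modified"

print(xbithack_treatment(0o744, "on"))    # prints: parsed, no Last-Modified
print(xbithack_treatment(0o754, "full"))  # prints: parsed, Last-Modified sent
print(xbithack_treatment(0o644, "full"))  # prints: not parsed
```

In practice this means chmod u+x marks a page for parsing, and chmod g+x should be added only to pages whose output is safe to cache.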

XSSI  

   

This is an extension of the standard SSI commands available in the XSSI module, which became a standard part of the Apache distribution in Version 1.2. XSSI adds the following abilities to the standard SSI:
